Variable Selection via Penalized Likelihood
Authors

Abstract
Variable selection is vital to statistical data analyses. Many of the procedures in use are ad hoc stepwise selection procedures, which are computationally expensive and ignore the stochastic errors introduced at earlier steps of the selection process. An automatic and simultaneous variable selection procedure can be obtained by using a penalized likelihood method. In traditional linear models, the best subset selection and stepwise deletion methods coincide with a penalized least-squares method when design matrices are orthonormal. In this paper, we propose a few new approaches to selecting variables for linear models, robust regression models and generalized linear models based on a penalized likelihood approach. A family of thresholding functions is proposed. The LASSO proposed by Tibshirani (1996) is a member of the penalized least-squares family with the L1 penalty. A smoothly clipped absolute deviation (SCAD) penalty function is introduced to improve on the properties of the L1 penalty. A unified algorithm is introduced, which is backed up by statistical theory. The new approaches are compared with the ordinary least-squares method, the garrote method of Breiman (1995) and the LASSO method of Tibshirani (1996). Our simulation results show that the newly proposed methods compare favorably with the other approaches as automatic variable selection techniques. Because variables are selected and parameters estimated simultaneously, we are able to give a simple estimated standard error formula, which is shown to be accurate enough for practical applications. Two real data examples illustrate the versatility and effectiveness of the proposed approaches.
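To make the thresholding connection concrete, here is a minimal Python sketch (illustrative only; the function names and the conventional choice a = 3.7 are assumptions, not taken from the paper) of the soft-thresholding rule behind the L1/LASSO penalty and the SCAD thresholding rule, applied to the ordinary least-squares coefficients z of an orthonormal design:

    import numpy as np

    def soft_threshold(z, lam):
        # L1 (LASSO) rule: shrink every coefficient toward zero by lam.
        return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

    def scad_threshold(z, lam, a=3.7):
        # SCAD rule: soft-threshold small coefficients, interpolate in
        # the middle range, and leave large coefficients unshrunk.
        z = np.asarray(z, dtype=float)
        out = np.empty_like(z)
        small = np.abs(z) <= 2 * lam
        mid = (np.abs(z) > 2 * lam) & (np.abs(z) <= a * lam)
        big = np.abs(z) > a * lam
        out[small] = np.sign(z[small]) * np.maximum(np.abs(z[small]) - lam, 0.0)
        out[mid] = ((a - 1) * z[mid] - np.sign(z[mid]) * a * lam) / (a - 2)
        out[big] = z[big]
        return out

    z = np.array([0.5, 1.5, 5.0])  # OLS coefficients under orthonormal design
    print(soft_threshold(z, 1.0))  # [0.  0.5 4. ] -- large effect shrunk
    print(scad_threshold(z, 1.0))  # [0.  0.5 5. ] -- large effect kept intact

Note how, with lam = 1, soft thresholding shrinks the large coefficient from 5 to 4, whereas SCAD returns it unshrunk: this near-unbiasedness for large effects is the property SCAD is designed to add to the L1 penalty.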
Similar articles
Penalized Bregman Divergence Estimation via Coordinate Descent
Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron et al. (2004) introduced the LARS algorithm. Recently, the coordinate descent (CD) algorithm was developed by Friedman et al. (2007) for penalized linear regression and penalized logistic regression and was shown to gain computational superiority. This paper explores...
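As a rough sketch of the coordinate descent idea for lasso-penalized linear regression (an illustration under simplifying assumptions, not Friedman et al.'s implementation; the lasso_cd name and the standardization assumption are ours):

    import numpy as np

    def lasso_cd(X, y, lam, n_sweeps=100):
        # Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1,
        # assuming the columns of X are standardized (mean 0, variance 1),
        # so each one-dimensional update has a closed soft-threshold form.
        n, p = X.shape
        b = np.zeros(p)
        r = y - X @ b  # residual, maintained incrementally
        for _ in range(n_sweeps):
            for j in range(p):
                rho = X[:, j] @ r / n + b[j]  # partial residual correlation
                b_new = np.sign(rho) * max(abs(rho) - lam, 0.0)
                r += X[:, j] * (b[j] - b_new)  # cheap residual update
                b[j] = b_new
        return b

Each full sweep costs O(np), and running it along a decreasing grid of lam values with warm starts helps explain the computational superiority noted above.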
Penalized Empirical Likelihood and Growing Dimensional General Estimating Equations
When a parametric likelihood function is not specified for a model, estimating equations provide an instrument for statistical inference. Qin & Lawless (1994) illustrated that empirical likelihood makes optimal use of these equations in inferences for fixed (low) dimensional unknown parameters. In this paper, we study empirical likelihood for general estimating equations with growing (high) dim...
A Connection Between Variable Selection and EM-Type Algorithms
Variable selection is fundamental to high-dimensional statistical modeling. Fan and Li (2001) proposed a class of variable selection procedures via nonconcave penalized likelihood. Optimizing the penalized likelihood function is challenging because it is a high-dimensional nonconcave function with singularities. A new algorithm is proposed for finding a solution of the nonconcave penalized likelihood...
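One concrete way to see both the difficulty and a standard attack on it is the local quadratic approximation of Fan and Li (2001), in which each singular penalty term is replaced by a quadratic around the current iterate, so that every step reduces to a ridge-type solve. The Python sketch below applies this idea to penalized least squares with the SCAD penalty (an illustration under our own simplifications; the helper names, the OLS starting value, and the deletion threshold are assumptions):

    import numpy as np

    def scad_deriv(theta, lam, a=3.7):
        # First derivative of the SCAD penalty, evaluated at |theta|.
        t = np.abs(theta)
        return lam * ((t <= lam) +
                      np.maximum(a * lam - t, 0.0) / ((a - 1) * lam) * (t > lam))

    def lqa_scad(X, y, lam, n_iter=20, eps=1e-6):
        # Local quadratic approximation: each step solves a ridge-type
        # system whose weights come from the penalty derivative at the
        # current iterate; near-zero coefficients are deleted at the end.
        n, p = X.shape
        b = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS starting value
        for _ in range(n_iter):
            w = scad_deriv(b, lam) / np.maximum(np.abs(b), eps)
            b = np.linalg.solve(X.T @ X / n + np.diag(w), X.T @ y / n)
        b[np.abs(b) < 1e-4] = 0.0  # treat tiny coefficients as deleted
        return b

Because the quadratic weight p'(|b_j|)/|b_j| blows up as b_j approaches zero, small coefficients are progressively shrunk and then removed, which is how selection and estimation happen in one pass.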
Dimension Reduction and Variable Selection in Case Control Studies via Regularized Likelihood Optimization
Dimension reduction and variable selection are performed routinely in case-control studies, but the literature on the theoretical aspects of the resulting estimates is scarce. We bring our contribution to this literature by studying estimators obtained via l1 penalized likelihood optimization. We show that the optimizers of the l1 penalized retrospective likelihood coincide with the optimizers ...
Publication date: 1999